Summary:

The  objective  of  this  paper  is  to  explore  the  effects  of  increasing  the 
amount  of  monetary  error  when  the  audited  amounts  are  generated  by  a  stochastic 
model.   The  increased  variance  when  there  is  a  material  monetary  error  makes  it 
desirable  to  design  the  sample  in  a  manner  different  from  current  practice. 


Using  a  Stochastic  Model  in  Stratified  Sampling 

Introduction. To use stratified sampling techniques to test for the
potential existence of a material monetary error, the auditor must select a
basis for stratification and determine an appropriate sample size. The
auditor's objective is to select a sample size that will maintain the
sampling risk (\(\beta\)) and risk of overauditing (\(\alpha\)) at tolerable
levels. Accomplishing this objective requires knowledge of the sampling
distribution of \(\hat{X}\), the estimated total audited amount, under two
conditions: when the population of recorded amounts contains no monetary
error and when total monetary error equals \(M\), a predetermined material
amount. A complicating factor is that as the amount of monetary error
increases, the standard deviation of audited amounts (or difference amounts)
may change. The purpose of this paper is to present a planning methodology
that takes this complication into account.

Model.   A  useful  way  of  studying  the  effects  of  monetary  errors  on  the 
sampling  distribution  of  the  estimated  total  audited  amount  is  to  use  a 
superpopulation  model.  From  a  superpopulation  viewpoint  the  observed 
audited  amounts  are  assumed  to  be  the  realized  outcomes  of  a  prescribed 
random  process.  Superpopulation  models  have  a  long  history  in  the 
sampling  literature.  Early  users  are  Cochran  (1939,  1946),  Deming  and 
Stephan  (1941)  and  Madow  and  Madow  (1944). 

A model for the audited amounts in the population is

\[ X_j = (1 - \theta_j) Y_j, \qquad j = 1, \ldots, N. \]

In this model, the recorded amount (\(Y_j\)) is not a random variable, but
the associated audited amount (\(X_j\)) is a realization of a random process.
The random variable \(X_j\) is generated from the recorded amount \(Y_j\) by
multiplication by the factor \((1 - \theta_j)\). \(\theta_j\) is a random
variable which takes on the value zero (0) with probability \((1 - \pi)\),
and with probability \(\pi\) takes a value in the support of \(F\), a
distribution function.

Kaplan (1973A) used a similar model, but he seems to have regarded
the recorded amounts as also being random variables. If the support of
\(F\) is the interval \((0, 1]\), \(\theta_j\) represents the relative amount
of overstatement associated with the recorded amount \(Y_j\). If \(F\) is a
jump function with a single jump at one (1), the audited amount \(X_j\) is
\(Y_j\) with probability \((1 - \pi)\) and zero (0) with probability \(\pi\).
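To make the random process concrete, the following minimal Python sketch simulates audited amounts from a list of recorded amounts. The function name `simulate_audited_amounts`, the `draw_theta` callback (standing in for a draw from \(F\)), and the numerical values are assumptions introduced for illustration; they are not part of the original text.

```python
import random

def simulate_audited_amounts(recorded, pi, draw_theta, rng):
    """Return audited amounts X_j = (1 - theta_j) * Y_j.

    theta_j is 0 with probability 1 - pi; otherwise it is drawn from F
    through the caller-supplied draw_theta(rng) (a hypothetical helper).
    """
    audited = []
    for y in recorded:
        theta = draw_theta(rng) if rng.random() < pi else 0.0
        audited.append((1.0 - theta) * y)
    return audited

# Assumed example: F has a single jump at 1, so every item is either
# correct or 100 percent overstated.
rng = random.Random(0)
recorded = [100.0] * 20000
audited = simulate_audited_amounts(recorded, 0.05, lambda r: 1.0, rng)
mean_audited = sum(audited) / len(audited)
print(mean_audited)  # should lie near 100 * (1 - 0.05 * 1) = 95
```

With a jump at one, every simulated audited amount is either the recorded amount or zero, which is the special case described in the preceding paragraph.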

Using the notation that \(E_\theta\) represents the expectation operator with
respect to the random variable \(\theta\), the following relationships hold:

\[ E_\theta X_j = Y_j (1 - \pi\mu_\theta), \quad \text{where } \mu_\theta = \int \theta \, dF(\theta), \tag{1} \]

and

\[ E_\theta \mathrm{Var}\,X \approx (1 - \pi\mu_\theta + \pi\sigma_\theta^2)\,\mathrm{Var}\,Y + (\pi\mu_\theta(1 - \pi\mu_\theta) + \pi\sigma_\theta^2)\,\bar{Y}^2, \tag{2} \]

where \(\mathrm{Var}\,X = \frac{1}{N}\sum_j (X_j - \bar{X})^2\),
\(\sigma_\theta^2 = \int (\theta - \mu_\theta)^2 \, dF(\theta)\), \(\bar{Y}\) is
the mean of the recorded amounts, and the symbol \(\approx\) means
approximate equality. The approximation is obtained by substituting one (1)
for the quantity \(\frac{N-1}{N}\), and so the terms on the right somewhat
overstate the expected variance.

The above relationship shows that the expected variance of audited
amounts exceeds the variance of recorded amounts whenever

\[ CV^2(Y) < \frac{\pi\mu_\theta(1 - \pi\mu_\theta) + \pi\sigma_\theta^2}{\pi\mu_\theta - \pi\sigma_\theta^2} = \frac{\mu_\theta(1 - \pi\mu_\theta) + \sigma_\theta^2}{\mu_\theta - \sigma_\theta^2}, \tag{3} \]

where \(CV^2(Y)\) denotes the square of the coefficient of variation of the
recorded amounts (\(Y_j\)). The coefficient of variation equals the ratio of
the standard deviation to the mean.
As a special case, because

\[ (1 - \pi) \le \frac{\mu_\theta(1 - \pi\mu_\theta) + \sigma_\theta^2}{\mu_\theta - \sigma_\theta^2} \]

when the support of \(F\) is the interval \((0, 1]\) (all monetary errors are
errors of overstatement), it follows that the expected variance of audited
amounts exceeds the variance of recorded amounts whenever

\[ CV^2(Y) < (1 - \pi). \tag{4} \]

A further result that easily follows from (2) is that for \(\pi\) fixed,
the expected variance of audited amounts is least among all distribution
functions \(F\) with a fixed mean, \(\mu_\theta\), when \(\sigma_\theta^2 = 0\).
That is, for a fixed mean and fixed proportion of recorded amounts having
monetary error, the distribution of relative error that concentrates at
\(\mu_\theta\) yields the smallest expected variance.

Finally,  whenever  tt  <  y  and  monetary  errors  represent  overstatements, 
the  expected  variance  of  recorded  amounts  is  least  among  all  distributions 
concentrated  at  a  single  point  when  that  point  is  one  (1). 
Similar results hold for the difference amounts

\[ D_j = Y_j - X_j = \theta_j Y_j. \]

Specifically,

\[ E_\theta D_j = \pi\mu_\theta Y_j \]

and

\[ E_\theta \mathrm{Var}\,D \approx \pi(\mu_\theta^2 + \sigma_\theta^2)\,\mathrm{Var}\,Y + (\pi(1 - \pi)\mu_\theta^2 + \pi\sigma_\theta^2)\,\bar{Y}^2. \tag{5} \]

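From (5) it follows that the expected variance of the monetary
errors exceeds the variance of the recorded amounts whenever

\[ CV^2(Y) < \frac{\pi(1 - \pi)\mu_\theta^2 + \pi\sigma_\theta^2}{1 - \pi(\mu_\theta^2 + \sigma_\theta^2)}. \]

Because the right-hand side of (5) becomes larger when \(\sigma_\theta^2\) is
made larger, we can replace \(\sigma_\theta^2\) by its largest possible value
on \((0, 1]\), \(\mu_\theta(1 - \mu_\theta)\), to obtain the following
inequality, provided that all monetary errors represent overstatement:

\[ E_\theta \mathrm{Var}\,D \le (\pi\mu_\theta)\,\mathrm{Var}\,Y + \pi\mu_\theta(1 - \pi\mu_\theta)\,\bar{Y}^2. \tag{6} \]

From (6) it follows that the variance of the recorded amounts will exceed
the expected variance of monetary errors when

\[ CV^2(Y) > \pi\mu_\theta. \]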
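The relationship between the expected variance of differences and its upper bound can be checked numerically. The sketch below evaluates the right-hand sides of (5) and (6) for a set of assumed illustrative values; the function names and the chosen numbers are hypothetical and serve only to show that the bound is attained when the variance of the relative error equals \(\mu_\theta(1 - \mu_\theta)\).

```python
def expected_var_differences(pi, mu, sigma2, var_y, mean_y):
    """Right-hand side of (5): expected variance of the difference amounts."""
    return (pi * (mu ** 2 + sigma2) * var_y
            + (pi * (1 - pi) * mu ** 2 + pi * sigma2) * mean_y ** 2)

def upper_bound_var_differences(pi, mu, var_y, mean_y):
    """Right-hand side of (6), i.e. (5) with sigma2 replaced by mu * (1 - mu)."""
    return pi * mu * var_y + pi * mu * (1 - pi * mu) * mean_y ** 2

# Assumed illustrative values (not taken from the text).
pi, mu, var_y, mean_y = 0.10, 0.50, 169.0, 50.0
exact = expected_var_differences(pi, mu, 0.05, var_y, mean_y)
at_max = expected_var_differences(pi, mu, mu * (1 - mu), var_y, mean_y)
bound = upper_bound_var_differences(pi, mu, var_y, mean_y)
cv2 = var_y / mean_y ** 2
print(exact, at_max, bound, cv2)
```

In this example the squared coefficient of variation exceeds \(\pi\mu_\theta\), so the bound falls below the variance of the recorded amounts, in line with the final inequality above.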

Application. Currently, many auditors plan the extent of a statistical
substantive test by analyzing the recorded amounts. A common procedure is
to form \(L\) strata using the cum \(\sqrt{f}\) technique, determine the
sample size using the standard deviations of the stratum recorded amounts,
and allocate the total sample size to the strata using Neyman allocation.

For this procedure to produce a sample size that is appropriate for
the purposes of maintaining the sampling risk (\(\beta\)) and the risk of
overauditing (\(\alpha\)) at prescribed tolerable levels, the standard
deviation of recorded amounts in each stratum must be sufficiently large.
The requisite size, of course, depends upon the statistical technique
involved.

Conceptually, the auditor may formulate an hypothesis that the total
monetary error, \(D\), equals (or exceeds) a material amount, \(M\), and test
this hypothesis against the alternative that the total monetary error is
less than a material amount. Specializing this to a test for overstatement
errors, the following representation obtains:

\[ H_0: D \ge M \]
\[ H_1: D < M, \]

where \(D\) represents the total overstatement error and \(M\) represents a
material amount of monetary error (note that \(D = Y - X\), where \(Y\)
represents the total recorded amount and \(X\) the total audited amount).

To test this hypothesis the recorded amounts may be divided into \(L\)
strata, a stratified random sample selected and evaluated to produce an
estimated audited amount, \(\hat{X}\), or equivalently, an estimated total
error amount \(\hat{D} = Y - \hat{X}\). The auditor determines the sample
size so that his risks are within tolerable levels. Specifically, a decision
rule is desired with the properties that

\[ \Pr\{\text{decide there is no material error} \mid D = M\} = \beta \]

and

\[ \Pr\{\text{decide there may be material error} \mid D = 0\} = \alpha, \]

where \(\beta\) and \(\alpha\) are the specified tolerable levels.

Finding an appropriate decision rule involves knowing the sampling
distribution of \(\hat{X}\) (or \(\hat{D}\)) both when \(D = M\) and when
\(D = 0\). If the sampling distribution is degenerate at \(D = 0\), then
\(D = E\), where \(E\) represents an insignificant monetary error, may be
used in place of \(D = 0\).

A difficulty that has been observed is that the standard error
of \(\hat{X}\) (or \(\hat{D}\)) depends upon the magnitude of the monetary
error present in the population. The symbol \(\sigma_D(\hat{X})\) (or
\(\sigma_D(\hat{D})\)) will be used to denote this standard error. If this
standard error were known, and if the sampling distribution of \(\hat{X}\)
(or \(\hat{D}\)) is approximately normal, the decision rule and sample size
could be determined as follows:

Decide there is no material error whenever

\[ Y - \hat{X} + Z_\beta \sigma_M(\hat{X}) < M \]

and whenever this is not satisfied, decide that there may be a material
amount of monetary error.

Select the sample size so that

\[ \Pr\{Y - \hat{X} + Z_\beta \sigma_M(\hat{X}) \ge M \mid D = 0\} = \alpha. \]

Using the assumed normality of the sampling distribution, this will be
satisfied whenever

\[ \frac{M - Z_\beta \sigma_M(\hat{X})}{\sigma_0(\hat{X})} = Z_\alpha. \]

Whenever \(\sigma_0(\hat{X}) \ge \sigma_M(\hat{X})\), the sample size could be
conservatively determined so that

\[ \frac{M}{Z_\beta + Z_\alpha} = \sigma_0(\hat{X}); \]

whenever \(\sigma_M(\hat{X}) > \sigma_0(\hat{X})\), the sample size could be
conservatively determined so that

\[ \frac{M}{Z_\beta + Z_\alpha} = \sigma_M(\hat{X}). \]

Footnote: The use of the symbols \(\beta\) and \(\alpha\) conforms to common
usage in auditing.
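As a rough illustration of the planning target and the decision rule just described, the sketch below computes the standard error that the sample must achieve and applies the rule to two hypothetical estimates. The function names, the normal deviates 1.96 and 1.645, and the dollar figures are assumptions chosen for illustration, not values prescribed at this point in the text.

```python
def target_standard_error(M, z_alpha, z_beta):
    """Standard error the plan must achieve: M / (Z_beta + Z_alpha)."""
    return M / (z_alpha + z_beta)

def no_material_error(total_recorded, est_audited, sigma_M, z_beta, M):
    """Decision rule: accept when Y - X_hat + Z_beta * sigma_M(X_hat) < M."""
    return total_recorded - est_audited + z_beta * sigma_M < M

# Assumed illustrative inputs.
M = 50_000.0
z_alpha, z_beta = 1.96, 1.645
sigma = target_standard_error(M, z_alpha, z_beta)

small_error = no_material_error(1_000_000.0, 990_000.0, sigma, z_beta, M)
large_error = no_material_error(1_000_000.0, 970_000.0, sigma, z_beta, M)
print(round(sigma, 2), small_error, large_error)
```

Passing the deviates in directly, rather than deriving them from \(\alpha\) and \(\beta\), keeps the sketch neutral about whether each risk is treated as one-sided or two-sided.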

Alternatively, the decision rule could be stated directly in terms
of \(\hat{D}\) and the corresponding \(\sigma_D(\hat{D})\). If the sampling
distribution is degenerate for \(D = 0\), then the condition \(D = 0\) may be
replaced by \(D = E\).

Using a stratified mean estimate (\(\hat{X}_S\)) of the total audited amount,
it is required that the stratum standard deviation of recorded amounts
be at least as large as the stratum standard deviation of audited amounts
when the total monetary error is material in amount. When this is not
so, the actual sampling risk may exceed the nominal sampling risk.

The fact that the actual sampling risk often exceeds the nominal
sampling risk has been observed by many authors. The reason most often cited
for this is that too few errors are observed in the sample (see Neter and
Loebbecke (1975)). Kaplan (1973B) offers the explanation that there is
correlation between the magnitude of the estimate of total monetary error
and the estimate of the standard error. The explanation offered here is that
one reason for underestimating the standard error is that the sample size
may be too small and not appropriately allocated to the strata.

To illustrate this, suppose that a population consisting of 10,000
items has a total recorded amount of $1,000,000. Dividing these items
into two strata based on the recorded amounts might produce the following
results:

             Population   Mean Recorded   Standard Deviation     Square of Coefficient
             Size         Amount ($)      of Recorded Amounts    of Variation

Stratum 1    6000          50             13                     .0676
Stratum 2    4000         175             20                     .0131

If the auditor specifies a material amount \(M\) = $50,000 and
\(\alpha = \beta = .05\), the required sample size based on the recorded
amounts is 129. Optimum allocation to the strata gives 64 to stratum 1 and
65 to stratum 2.
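The sample size and allocation quoted for this example can be reproduced with the usual Neyman-allocation formula for a fixed target standard error. The sketch below is one way to do so; it assumes a finite population correction, normal deviates of 1.96 for \(\alpha\) and 1.645 for \(\beta\), and rounding each stratum allocation up. None of these conventions is stated in the text; they are chosen because they appear to recover the quoted figures. The function name is hypothetical.

```python
import math

def plan_stratified_sample(sizes, sds, M, z_alpha=1.96, z_beta=1.645):
    """Neyman allocation meeting the target standard error M / (z_alpha + z_beta).

    sizes: stratum population sizes N_h; sds: stratum standard deviations S_h.
    Uses n = (sum N_h S_h)^2 / (target^2 + sum N_h S_h^2), which includes the
    finite population correction, then rounds each stratum share up.
    """
    target_se = M / (z_alpha + z_beta)
    sum_ns = sum(N * s for N, s in zip(sizes, sds))
    sum_ns2 = sum(N * s * s for N, s in zip(sizes, sds))
    n = sum_ns ** 2 / (target_se ** 2 + sum_ns2)
    return [math.ceil(n * N * s / sum_ns) for N, s in zip(sizes, sds)]

allocation = plan_stratified_sample([6000, 4000], [13.0, 20.0], 50_000)
print(allocation, sum(allocation))  # [64, 65] 129
```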

Assuming that all monetary errors are overstatements, and that the
monetary errors are randomly distributed among the 10,000 items, the
smallest expected variance of audited amounts occurs when .05 of the
items are 100 percent overstated. Using equation (2), it follows that
the expected standard deviation of audited amounts in stratum 1 is 16.71
and in stratum 2 is 42.83. Using these values, the required sample size
is 368, with 136 allocated to stratum 1 and 232 to stratum 2.
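The two stratum standard deviations follow from equation (2) in the special case where the relative error is always one (\(\mu_\theta = 1\), \(\sigma_\theta^2 = 0\)) and \(\pi = .05\), so that the expected variance reduces to \((1 - \pi)\mathrm{Var}\,Y + \pi(1 - \pi)\bar{Y}^2\). A short check, with a hypothetical function name:

```python
import math

def expected_sd_audited_all_or_nothing(pi, sd_y, mean_y):
    """Expected SD of audited amounts when errors are 100 percent overstatements.

    Special case of (2) with mu = 1 and sigma^2 = 0:
    (1 - pi) * Var Y + pi * (1 - pi) * mean^2.
    """
    var_x = (1 - pi) * sd_y ** 2 + pi * (1 - pi) * mean_y ** 2
    return math.sqrt(var_x)

sd1 = expected_sd_audited_all_or_nothing(0.05, 13.0, 50.0)
sd2 = expected_sd_audited_all_or_nothing(0.05, 20.0, 175.0)
print(round(sd1, 2), round(sd2, 2))  # 16.71 42.83
```

Most of the increase comes from the term in \(\bar{Y}^2\), which is why stratum 2, with the larger mean recorded amount, is affected far more than stratum 1.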

In this example the total sample size of 129 is too small, and the
share of the sample allocated to stratum 2 is much smaller than warranted.

In practice, of course, the auditor does not know the value of the
standard error of the estimate when there is a material amount of monetary
error in the population. Nevertheless, certain courses of action are
possible:

1.  Design the sampling plan so that \(\sigma_0(\hat{X})\) will be at least
    as large as \(\sigma_M(\hat{X})\) (or \(\sigma_M(\hat{D})\)).

2.  Design the sampling plan based on an estimate of \(\sigma_M(\hat{X})\)
    (or \(\sigma_M(\hat{D})\)).

To implement either suggested action the auditor needs to consider
the possible distribution of monetary error supposing the total monetary
error equals a material amount. From equation (1), it follows that if
monetary errors are randomly distributed among the population items,

\[ \pi\mu_\theta = \frac{M}{Y}. \tag{7} \]

This relationship implies that the proportion of items in error may range
from \(\frac{M}{Y}\) up to 100 percent.

To study the effect of the distribution of \(\theta\), the following class of
distributions is considered for the case of overstatement errors:

\[ f(\theta) = c\,\theta^{c-1}, \qquad 0 < \theta \le 1, \quad c > 0. \]

For this class of distributions, \(\mu_\theta = \frac{c}{c+1}\) and
\(\sigma_\theta^2 = \frac{c}{(c+1)^2(c+2)}\).

The shape of the distribution is determined by the parameter \(c\). In the
range \(0 < c < 1\), the density function decreases monotonically over the
interval \(0 < \theta \le 1\). For example, Figure 1 is a plot of the density
for a value of \(c\) in this range.


Figure 1 about here

When \(c = 1\), the distribution is uniform over the interval
\(0 < \theta \le 1\), and when \(c > 1\), the density increases over the
interval \(0 < \theta \le 1\). Figure 2 is a plot of the density for \(c = 3\).

Figure 2 about here

In the limiting case (\(c \to +\infty\)), the distribution is entirely
concentrated at \(\theta = 1\). This corresponds to having all monetary
errors be 100 percent overstatement errors.
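The closed-form mean and variance of this class of densities can be confirmed by direct numerical integration. The following sketch compares the formulas \(c/(c+1)\) and \(c/((c+1)^2(c+2))\) with a midpoint-rule approximation for \(c = 3\); the function names and the number of integration steps are assumptions made for illustration.

```python
def power_density_moments(c):
    """Closed-form mean and variance of f(theta) = c * theta**(c - 1) on (0, 1]."""
    mean = c / (c + 1)
    variance = c / ((c + 1) ** 2 * (c + 2))
    return mean, variance

def numeric_moments(c, steps=200_000):
    """Midpoint-rule approximation of the same mean and variance."""
    h = 1.0 / steps
    m1 = m2 = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        w = c * t ** (c - 1) * h  # density times step width
        m1 += t * w
        m2 += t * t * w
    return m1, m2 - m1 ** 2

closed = power_density_moments(3.0)
approx = numeric_moments(3.0)
print(closed, approx)
```

As \(c\) grows, the closed-form mean approaches one and the variance approaches zero, which matches the limiting case of 100 percent overstatement errors.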

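From (2), the expected variance of the audited amounts is given by
the following expression:

\[ E_\theta \mathrm{Var}\,X \approx \left(1 - \pi\frac{c}{c+1} + \pi\frac{c}{(c+1)^2(c+2)}\right)\mathrm{Var}\,Y + \left(\pi\frac{c}{c+1}\left(1 - \pi\frac{c}{c+1}\right) + \pi\frac{c}{(c+1)^2(c+2)}\right)\bar{Y}^2. \tag{8} \]

To obtain the value of this expected variance when there is a material
amount of monetary error, we set \(\pi\frac{c}{c+1} = \frac{M}{Y}\). This
yields

\[ E_\theta \mathrm{Var}\,X \approx \left(1 - \frac{M}{Y} + \frac{\frac{M}{Y}\left(1 - \frac{1}{\pi}\frac{M}{Y}\right)^2}{2 - \frac{1}{\pi}\frac{M}{Y}}\right)\mathrm{Var}\,Y + \left(\frac{M}{Y}\left(1 - \frac{M}{Y}\right) + \frac{\frac{M}{Y}\left(1 - \frac{1}{\pi}\frac{M}{Y}\right)^2}{2 - \frac{1}{\pi}\frac{M}{Y}}\right)\bar{Y}^2. \tag{9} \]

The expected variance increases as \(\pi\) increases from its minimum value
of \(\frac{M}{Y}\) to 1. At \(\pi = 1\), the expected variance is

\[ E_\theta \mathrm{Var}\,X \approx \left(1 - \frac{M}{Y}\right)\left(1 + \frac{\frac{M}{Y}\left(1 - \frac{M}{Y}\right)}{2 - \frac{M}{Y}}\right)\mathrm{Var}\,Y + \frac{M}{Y}\left(1 - \frac{M}{Y}\right)\left(1 + \frac{1 - \frac{M}{Y}}{2 - \frac{M}{Y}}\right)\bar{Y}^2, \tag{10} \]

while at \(\pi = \frac{M}{Y}\),

\[ E_\theta \mathrm{Var}\,X \approx \left(1 - \frac{M}{Y}\right)\mathrm{Var}\,Y + \frac{M}{Y}\left(1 - \frac{M}{Y}\right)\bar{Y}^2. \tag{11} \]

Figure 2. Density \(f(\theta) = 3\theta^2\), \(0 < \theta \le 1\).

Rewriting (10) in the following way allows us to see how much larger the
maximum expected variance is:

\[ E_\theta \mathrm{Var}\,X \approx \left(1 - \frac{M}{Y}\right)\mathrm{Var}\,Y + \frac{M}{Y}\left(1 - \frac{M}{Y}\right)\bar{Y}^2 + \frac{\frac{M}{Y}\left(1 - \frac{M}{Y}\right)^2}{2 - \frac{M}{Y}}\left[\mathrm{Var}\,Y + \bar{Y}^2\right]. \]

The following table exhibits the values of the factor
\(\frac{\frac{M}{Y}(1 - \frac{M}{Y})^2}{2 - \frac{M}{Y}}\) for various values
of \(\frac{M}{Y}\).

    M/Y      (M/Y)(1 - M/Y)^2 / (2 - M/Y)

    .01      .0049
    .02      .0097
    .03      .0143
    .04      .0188
    .05      .0231
    .06      .0273
    .07      .0314
    .08      .0353
    .09      .0390
    .10      .0426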
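The tabulated factor can be recomputed directly. The sketch below assumes the factor is \((M/Y)(1 - M/Y)^2 / (2 - M/Y)\), the variance of the relative error at \(\pi = 1\) for the class of densities considered, which reproduces every entry of the table to four decimals; the function name is hypothetical.

```python
def excess_variance_factor(r):
    """Factor multiplying [Var Y + mean^2] at pi = 1, with r = M / Y (assumed form)."""
    return r * (1 - r) ** 2 / (2 - r)

table = [(k / 100, round(excess_variance_factor(k / 100), 4)) for k in range(1, 11)]
for r, factor in table:
    print(f"{r:.2f}  {factor:.4f}")
```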


In the previous numerical example, we stated that the smallest
expected standard deviations of audited amounts are 16.71 in stratum 1 and
42.83 in stratum 2. With the model just described, the largest expected
standard deviations of audited amounts are 17.06 in stratum 1 and 44.83
in stratum 2. Using these maximum amounts, the required sample size
increases to 394, with 143 allocated to stratum 1 and 251 to stratum 2.

This numerical example assumes that the estimated audited amount
is to be based on a stratified mean estimate of the sample audited
amounts. We now consider the situation when the auditor intends to
design the sample based on using a stratified difference estimator. From
(5), we have

\[ E_\theta \mathrm{Var}\,D \approx \pi(\mu_\theta^2 + \sigma_\theta^2)\,\mathrm{Var}\,Y + (\pi(1 - \pi)\mu_\theta^2 + \pi\sigma_\theta^2)\,\bar{Y}^2. \]

Setting \(\mu_\theta = \frac{c}{c+1}\) and
\(\sigma_\theta^2 = \frac{c}{(c+1)^2(c+2)}\), this becomes

\[ E_\theta \mathrm{Var}\,D \approx \left(\pi\left(\frac{c}{c+1}\right)^2 + \pi\frac{c}{(c+1)^2(c+2)}\right)\mathrm{Var}\,Y + \left(\pi(1 - \pi)\left(\frac{c}{c+1}\right)^2 + \pi\frac{c}{(c+1)^2(c+2)}\right)\bar{Y}^2. \]

Imposing the condition that \(\pi\frac{c}{c+1} = \frac{M}{Y}\), we have

\[ E_\theta \mathrm{Var}\,D \approx \left[\frac{1}{\pi}\left(\frac{M}{Y}\right)^2 + \frac{\frac{M}{Y}\left(1 - \frac{1}{\pi}\frac{M}{Y}\right)^2}{2 - \frac{1}{\pi}\frac{M}{Y}}\right]\mathrm{Var}\,Y + \left[\frac{1}{\pi}\left(\frac{M}{Y}\right)^2(1 - \pi) + \frac{\frac{M}{Y}\left(1 - \frac{1}{\pi}\frac{M}{Y}\right)^2}{2 - \frac{1}{\pi}\frac{M}{Y}}\right]\bar{Y}^2, \tag{12} \]

where \(\pi\) can range from \(\frac{M}{Y}\) (\(c = +\infty\)) to 1
(\(c = \frac{M/Y}{1 - M/Y}\)).

Because \(E_\theta \mathrm{Var}\,D\) is a strictly decreasing function of
\(\pi\) in this range, the maximum value of \(E_\theta \mathrm{Var}\,D\) occurs
at \(\pi = \frac{M}{Y}\) (\(c = +\infty\)), and the largest expected variance
is

\[ E_\theta \mathrm{Var}\,D \approx \frac{M}{Y}\,\mathrm{Var}\,Y + \frac{M}{Y}\left(1 - \frac{M}{Y}\right)\bar{Y}^2. \tag{13} \]

This result demonstrates that the upper bound cited in (6) is achieved in
this class of distributions. Using the same numerical example, the maximum
value under this model for the standard deviation of difference amounts
in stratum 1 is 11.28, and in stratum 2, 38.40. The required sample
size using these amounts is 247, with 76 allocated to stratum 1 and 171
allocated to stratum 2.
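These figures can be reproduced from (13) together with a standard Neyman-allocation calculation. The sketch below assumes \(M/Y = .05\), normal deviates of 1.96 for \(\alpha\) and 1.645 for \(\beta\), a finite population correction, and rounding each stratum allocation up; these conventions are not stated in the text and are adopted only because they appear to recover the quoted values. The function names are hypothetical.

```python
import math

def max_expected_sd_differences(r, sd_y, mean_y):
    """Square root of (13): r * Var Y + r * (1 - r) * mean^2, with r = M / Y."""
    return math.sqrt(r * sd_y ** 2 + r * (1 - r) * mean_y ** 2)

def neyman_allocation(sizes, sds, target_se):
    """Stratum sample sizes (rounded up) that meet a target standard error."""
    sum_ns = sum(N * s for N, s in zip(sizes, sds))
    sum_ns2 = sum(N * s * s for N, s in zip(sizes, sds))
    n = sum_ns ** 2 / (target_se ** 2 + sum_ns2)  # includes the fpc
    return [math.ceil(n * N * s / sum_ns) for N, s in zip(sizes, sds)]

r = 50_000 / 1_000_000  # M / Y
sizes = [6000, 4000]
sds = [max_expected_sd_differences(r, 13.0, 50.0),
       max_expected_sd_differences(r, 20.0, 175.0)]
allocation = neyman_allocation(sizes, sds, 50_000 / (1.96 + 1.645))
print([round(s, 2) for s in sds], allocation, sum(allocation))
```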

As expected, the stratified difference estimator requires a smaller
sample size than the stratified mean estimator. In fact, the sample size
required by the stratified difference estimator is less than the sample
size required by the stratified mean estimator even when the latter uses
the smallest possible variance under the model. However, when compared
to the sample size computed using the recorded amounts, the stratified
difference estimator requires more. The reason for the increase is that
the stratification of the recorded amounts resulted in the top stratum
having the square of the coefficient of variation equal to .0131, which
is far less than \(\frac{M}{Y} = .05\).

Sample Planning. Because both the decision rule and the sample size
depend upon the standard error when the monetary error is a material
amount (\(D = M\)), planning should take this into account. When all
monetary errors represent overstatement and the errors are randomly
distributed among the population items, the auditor might plan on the basis
of the largest possible standard error. Two possible ways of doing this
are to maintain the square of the coefficient of variation larger than
\(\frac{M}{Y}\) within each stratum, or to use an upper bound for the
variance of differences within each stratum.

To implement the first of these, it would be necessary to carry
out the stratification process only to the point that the square of the
coefficient of variation within each stratum is at least as large as
\(\frac{M}{Y}\). For the second, the bound displayed in (6) could be used to
represent the expected variance of differences within each stratum.
Limited empirical tests of this on Neter and Loebbecke's population 4
would suggest that increasing the number of strata beyond 4-6 does not
produce any appreciable savings in sample size. Table 2 below shows the
sample size found using the computer program PLAN2 described in Roberts
(1978) for \(\alpha = \beta = .05\), desired precision equal to $295,125 and
the proportion of differences equal to .10.

    Number of Strata     Sample Size

           2                 284
           5                 194
           7                 184
           9                 180
          12                 178

An additional advantage of planning the sample using either of these
procedures is that the auditor can evaluate the sample results. For example,
suppose \(L\) strata have been used and planning has been based on the maximum
standard deviation of differences. In this case, the maximum standard
error is as follows:


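\[ \sigma_M(\hat{D}_S) = \sqrt{\sum_{i=1}^{L} N_i (N_i - n_i) \frac{S_i^2}{n_i}}, \]

where \(N_i\) and \(n_i\) are the population and sample sizes in stratum
\(i\) and \(S_i\) is the maximum standard deviation of differences used in
planning for that stratum.

The decision rule is to decide there is no material error whenever

\[ \hat{D}_S + Z_\beta \sigma_M(\hat{D}_S) < M. \]
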
This  decision  rule  may  be  used  regardless  of  the  number  of  errors  found 
in  the  sample. 
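The evaluation step can be sketched as follows. The code assumes the standard stratified standard error with a finite population correction, \(\sqrt{\sum_i N_i (N_i - n_i) S_i^2 / n_i}\), and reuses the stratum sizes, sample sizes, and maximum standard deviations of differences from the earlier numerical example. The function names, the deviate 1.645, and the two estimated error totals are hypothetical values for illustration.

```python
import math

def max_standard_error(sizes, samples, max_sds):
    """Stratified standard error of the estimated total, with the fpc (assumed form)."""
    total = sum(N * (N - n) * s * s / n
                for N, n, s in zip(sizes, samples, max_sds))
    return math.sqrt(total)

def decide_no_material_error(est_total_error, se, z_beta, M):
    """Accept when the estimate plus Z_beta standard errors stays below M."""
    return est_total_error + z_beta * se < M

se = max_standard_error([6000, 4000], [76, 171], [11.28, 38.40])
accept_small = decide_no_material_error(20_000.0, se, 1.645, 50_000.0)
accept_large = decide_no_material_error(30_000.0, se, 1.645, 50_000.0)
print(round(se, 1), accept_small, accept_large)
```

Because the standard error is fixed at the planning stage, the rule does not depend on a variance estimated from the sample, which is why it can be applied however few errors the sample contains.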

